RESEARCH OF RICHARD E. BARLOW

I will describe my research progress and significant results in terms of
three areas of research interest, namely: 1) System reliability; 2) Combination
of opinions; and 3) Bayesian statistical applications to data analysis and
quality assurance.

1.  System  Reliability 

Perhaps  the  most  important  contribution  was  the  generalization  and 
simplified  proof  of  the  signed  domination  theorem  [cf.  Set  Theoretic  Signed 
Domination  for  Coherent  Systems  (1982)  and  Computational  Complexity  of 
Coherent  Systems  and  the  Reliability  Polynomial  (1985)].  The  signed 
domination  theorem  lies  at  the  heart  of  the  proof  of  the  topological  formula 
and  many  other  key  results  in  network  reliability  theory.  It  is  a  unifying 
result,  some  of  whose  applications  were  also  described  in  "A  Survey  of  Network 
Reliability  and  Domination  Theory"  (1984). 

In  system  reliability  prediction,  one  of  the  most  difficult  problems 
(especially  if  the  classical  statistics  approach  is  used)  is  to  combine 
component  and  system  failure  data.  A  Bayesian  approach  based  on  calculating 
the  posterior  variance  is  described  in  "Combining  Component  and  System 
Information  in  System  Reliability  Calculation"  (1985)  and  also  in  "Assessing 
the  Reliability  of  Computer  Software..."  (1985). 


2. Combination of Opinions

Two important research questions were addressed in different
publications. The first question is: How should a decision maker combine the
opinions of several experts about an unknown quantity? In "Combination of
Experts' Opinions Based on Decision Theory" (1986) an approach was suggested
for the case when it is not appropriate for the decision maker to exercise
more than minimal judgement, as in the case of a government agency. The second
question is: How should a group reach a consensus about an unknown
quantity, based on their possibly very different opinions? The group Pareto
optimal decisions are characterized in "The Group Consensus Problem" (1985). A
result concerning Pareto optimal group decisions (which was mistakenly
attributed to de Finetti) is refined and generalized. It turns out that de
Finetti's paper (which was in Italian) actually contains a different set of
results. The translation and investigation of the implications of de
Finetti's important results are still being pursued.

3.  Bayesian  Statistical  Applications 

The Bayesian approach is used in "A Critique of Deming's Discussion of
Acceptance Sampling Procedures" (1986) to correct Deming's claim that in
inspection sampling the only rule which should be followed is the "all or
none" inspection rule. Deming's claim holds if the percentage defective in a
lot is fairly well specified a priori, but not if there is sufficient initial
uncertainty. Computing algorithms are given together with elegant analytical
solutions for special cases.

In "Informative Stopping Rules" (1984) the case when the stopping rule
is informative is examined in detail, using some examples which arose in
practice. This is important because almost all models in the literature
assume that the stopping rule is noninformative.

A  recent  paper  "Using  Influence  Diagrams  to  Solve  the  Calibration 
Problem"  considers  the  problem  of  designing  an  experiment  to  calibrate  a 
measuring  instrument.  A  numerical  algorithmic  solution  is  provided  for  the 
case  when  the  prior  distributions  are  multivariate  normal.  The  number  of 
required  integrations  is  reduced  to  three.  This  means  the  problem  can  be 
solved  on  a  desktop  computer.  In  general  the  problem  will  require  a  much 
larger  computer.  From  a  theoretical  standpoint,  the  most  interesting  result 
is  that,  unlike  the  usual  linear  regression  experimental  design  problem,  the 
optimal design in the inverse linear regression problem does not, in general,
lie  on  the  boundary  of  the  feasible  region. 

RESEARCH OF WILLIAM S. JEWELL

Research  progress  and  significant  results  occurred  in  three  areas  of 
research interest: 1) Reliability Growth and Software Reliability; 2)
Bayesian  Approximation  Methods;  and  3)  Risk  Portfolio  Problems.  Additionally, 
work  in  progress  is  described  under  (4)  Hierarchical  Models.  These  areas  are 
not  mutually  exclusive,  since  many  papers  are  related  to  each  other. 


(1)  Reliability  Growth  and  Software  Reliability 

The  major  contribution  in  reliability  growth  modelling  was  "A  General 
Framework  for  Learning  Curve  Reliability  Growth  Models"  (1983  A).  Results 
from  this  general  learning-curve  model  demonstrate  that  it  is  very  difficult 
to  make  predictions  of  the  ultimate  failure  rate  of  a  stochastic  learning 
curve  from  limited-interval  initial  data,  unless  a  very  large  number  of 
systems  are  on  test  simultaneously,  because  the  data  likelihood  is  very  broad. 
This  is  true  even  when  the  initial  failure  rate  and  the  learning-curve  form 
are  known  exactly,  and  casts  doubt  upon  both  classical  and  Bayesian  point 
estimates  of  ultimate  reliability. 


Reliability growth can also occur when discrete defects (out of some
finite, but unknown number) are removed during an initial inspection testing
program, as in software reliability models. The objective is to estimate the
number of defects remaining after the inspection is terminated. "Bayesian
Estimation of Undetected Errors" (1983 B) treats the multiple-inspector,
fixed-effort case, where both the error detection efficiencies and the number
of bugs are unknown a priori; the full-distributional results also provide a
Bayesian generalization to a well-known capture-recapture biometric formula.
"Bayesian Extension to a Basic Model of Software Reliability" (1985 C)
analyzes the single-inspector continuous-time model, giving a similar Bayesian
prediction of the distribution of undetected errors when testing is stopped,
as well as an updated estimate of the detection rate parameter. (1985 D)
analyzes the dynamics of these estimates when there is a single "probable
object" with prior probability, which remains unfound as time progresses; this
model is related to international incidents of territorial intrusion.


(2) Bayesian Approximation Methods


For many years I have been interested in linearized approximation to
Bayesian predictions, referred to in actuarial articles as "credibility
theory." Previous interim reports have described the development of this, by
now, rich and varied field.


"Enriched Multinormal Priors Revisited" (1982 E) corrects and updates an
earlier paper on the practically important model of the multinormal likelihood
with unknown mean vector and precision matrix. The traditional Normal-Wishart
prior has the inconvenience of being too "thin" (too few hyperparameters),
which also makes the Bayesian mean and covariance predictions too simple,
compared to a multi-dimensional credibility approximate forecast. The main
result of this paper is a new prior joint distribution for the means and
precisions that corrects this thinness.

"Credibility Approximations for Bayesian Prediction of Second Moments"
(1984 F) (joint with R. Schnieper) extends the basic model of the credible
mean to the problem of approximating the second moments of the predictive
distribution as a linear combination of natural first- and second-order
statistics; exact results for many important analytic densities also use
various combinations of these statistics. Assuming that the various (up to
fourth-order) hyperparameters can be estimated, the joint moment forecasts
involve the inversion of a 3 by 3 matrix.

(3) Risk Portfolio Problems

Variations of the compound law, which governs the sum of a random number
of random variables, are often used to describe an individual risk (insurance
contract) which undergoes a random number of random-sized financial shocks in
a fixed period, or a risk portfolio composed of such risks. "Approximating
the Distribution of a Dynamic Risk Portfolio" (1983 G) is a typical
model-development paper in this area that examines the case in which the
composition of the portfolio is also random.

Additionally, the compound distribution and its variants are
notoriously difficult to compute exactly because of the need for high-order
convolutions; for many years, approximations based on the normal distribution
were the preferred approach. Then, H. Panjer and others discovered that, for
a certain class of counting ("frequency" of shocks) distributions, and for
discrete and positive shock value ("severity") distributions, one could set up
recursive formulae for calculating the distribution of total amount ("loss").
A joint paper with B. Sundt (1981 H) provided extensions to Panjer's result.
Then, in 1984-5, R. Milidiu did his thesis research on the extension to
the important case where the "frequency" is Negative Binomial and the
"severity" can have both negative and positive values; Panjer-type formulae
still exist, but it is now impossible to "get started" on a recursive method.
However, various iterative and approximation approaches suggest themselves,
and a variety of such strategies were explored computationally. Initial
results are reported in the joint paper "Strategies for Computation of
Compound Distributions with Two-Sided Severities" (1986 I).
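
As an illustration of the one-sided recursion mentioned above, the following
is a minimal sketch of the textbook Panjer recursion for severities on the
positive integers. It assumes the usual statement of the frequency class,
p_n = (a + b/n) p_{n-1}, which this report does not spell out; the function
name panjer, its arguments, and the numerical inputs are hypothetical and
chosen only for illustration. It is not the two-sided method studied in the
thesis work.

```python
import math

def panjer(a, b, p0, severity, s_max):
    """Distribution of total loss S = Y_1 + ... + Y_N for s = 0..s_max.

    Assumes (not from the report) that the frequency N satisfies
    p_n = (a + b/n) * p_{n-1}, and that the severity is a dict
    {positive integer amount: probability}, so that P(S = 0) = P(N = 0) = p0.
    """
    g = [p0] + [0.0] * s_max
    for s in range(1, s_max + 1):
        # g(s) = sum_j (a + b*j/s) f(j) g(s - j); needs g(0) to get started
        g[s] = sum((a + b * j / s) * fj * g[s - j]
                   for j, fj in severity.items() if j <= s)
    return g

# Poisson frequency with mean lam corresponds to a = 0, b = lam, p0 = exp(-lam).
lam = 2.0
loss = panjer(0.0, lam, math.exp(-lam), {1: 0.5, 2: 0.5}, s_max=60)
print(sum(loss), sum(s * p for s, p in enumerate(loss)))
```

The recursion starts from g(0), which is exactly the step that is no longer
available once severities can be negative, since every total then depends on
both smaller and larger totals.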

(4) Hierarchical Models

Research effort in 1986 has focussed on hierarchical models, in which
"cohort data," generated using different values of the underlying risk
parameter, are used to assist the primary prediction process. The necessary
correlation between the unknown different parameters is explicated by assuming
that they, in turn, depend upon some unknown hyper-prior parameter and
distribution, thus giving a hierarchy of random quantities:
observables-parameters-hyperparameter. This makes the parameters of the
various cohort components exchangeable random variables. In spite of the
obvious practical impact of a hierarchical model, few analytic results are
known, primarily for the normal-normal-normal (fixed variances) formulation
due to Lindley and Smith. The author analyzed the predictive hierarchical mean
from the credibility point of view in a 1975 paper.

Based upon this paper and second moment results described in (1984 F),
Hans Buhlmann (ETH, Zurich) and I have developed a simultaneous first- and
second-moment credibility prediction method, which uses all possible first-
and second-order statistics that can be found from cohort data components.
The resulting least-squares analysis can easily be carried out, but the model
requires a large number of hyper-moments to be obtained. Asymptotic results,
for a large number of data points, or for a large number of cohort components,
are of interest, and give insight into how an empirical Bayes estimation of
variance should proceed.

Also,  to  provide  an  exact  analytic  formulation  against  which  to  test  the 
above  computations,  the  author  has  been  able  to  generalize  the 
normal-normal-normal  model  in  a  heteroscedastic  manner,  by  permitting  unknown, 
but  linked,  variances  at  each  level  of  the  hierarchy. 

Both  of  these  papers  will  appear  shortly. 

My research under grant AFOSR-81-0122 has fallen into four main
categories, namely: (1) Simulation; (2) Software Reliability: Estimation and
Testing; (3) Reliability Models; and (4) Peaks from Random Data.

1. Simulation

Almost all dynamic reliability systems can be modelled as Markov
processes, either in discrete or continuous time, and a basic question is to
determine the time, starting from a given initial state, until the process
enters a state that is considered failed. As such distributions are usually
difficult to evaluate analytically, a simulation analysis was presented by
Ross and Schechner in [1] "Using Simulation to Estimate First Passage
Distributions."

Specifically, they considered a discrete time Markov process
$\{X_n, n = 0, 1, \ldots\}$ such that whenever the present state is $x$ the
next state is chosen according to the distribution $P_x$. The initial state $i$
was fixed and for a given set of states $A$ they were interested in estimating
the distribution and the mean of $N$, the number of transitions until the
Markov process enters the set $A$, by use of simulation. By standard techniques
such a chain can be simulated until it reaches $A$; call each such simulation a
run. It was then shown that estimators based on

$$N + 1 - \sum_{j=2}^{N+1} P_{X_{j-2}}(A),$$

where $N$ is the number of steps taken in a given run, $X_j$ is the $j$th
state in that run, and $P_x(A)$ is the probability of going from $x$ to the set
$A$ in a single transition, have the same mean and smaller variance than the
usual estimator $N$. Hence, the average over all runs of this quantity is a
better estimate of $E(N)$ than is the average run length. In addition, a second
estimate, based on the observed hazard rate, was given.
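
The first estimator can be sketched on a toy chain. The three-state
transition table below (0 = good, 1 = degraded, "F" = failed, with A = {"F"})
is a hypothetical example, not taken from [1], and the function names are
illustrative only. Each run records both the raw count N and the adjusted
quantity N + 1 minus the sum of the one-step probabilities of entering A from
the states visited before absorption.

```python
import random
import statistics

# Hypothetical transition table: state -> list of (next_state, probability).
P = {
    0: [(0, 0.70), (1, 0.25), ("F", 0.05)],
    1: [(0, 0.10), (1, 0.60), ("F", 0.30)],
}
A = {"F"}  # the set of failed states

def prob_into_A(x):
    """One-step probability P_x(A) of entering A from state x."""
    return sum(p for y, p in P[x] if y in A)

def step(x, rng):
    """Draw the next state from the distribution P_x."""
    u, acc = rng.random(), 0.0
    for y, p in P[x]:
        acc += p
        if u < acc:
            return y
    return P[x][-1][0]  # guard against floating-point round-off

def one_run(rng, start=0):
    """Return (N, N + 1 - sum of P_x(A) over the states visited before A)."""
    x, n, s = start, 0, 0.0
    while x not in A:
        s += prob_into_A(x)
        x = step(x, rng)
        n += 1
    return n, n + 1 - s

def estimate(runs=20000, seed=1):
    rng = random.Random(seed)
    raw, adj = zip(*(one_run(rng) for _ in range(runs)))
    return {
        "raw_mean": statistics.mean(raw), "raw_var": statistics.variance(raw),
        "adj_mean": statistics.mean(adj), "adj_var": statistics.variance(adj),
    }

print(estimate())
```

For this toy chain the first-step equations can be solved by hand and give
E(N) = 130/19, about 6.84, starting from state 0, so both sample means should
settle near that value while the adjusted estimator shows the smaller sample
variance.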

Another important problem from a reliability application viewpoint is the
estimation of the distribution of the final state. This is important since it
represents the failed state and thus repair will depend on it. Such an
estimate was presented by working with a modified version of the hazard rate
function. Specifically, let $B \subset A$ and define $N_B$ to equal the number
of transitions needed to reach $B$ in a run (and thus $N_B = \infty$ if the
final state is in $A - B$). Rather than estimating the hazard rate function of
$N_B$, namely $P\{N_B = n \mid N_B \ge n\}$, the modified version
$P\{N_B = n \mid N \ge n\}$ was employed, and an estimator based on this was
given.

A second research report dealing with simulation was the report [2]
(joint with Z. Schechner) entitled "Simulation Uses of the Exponential
Distribution."  This  paper  showed  how  simulated  values  from  an  exponential 
distribution  could  be  effectively  used  to  simulate  such  diverse  quantities  as 
normal  order  statistics,  multi-dimensional  Poisson  processes  and 


2. Software Reliability: Estimation and Testing

Consider a complicated system that originally has $m$ defects. Defect $i$
will cause a system failure after a random time that is exponentially
distributed with rate $\lambda_i$, $i = 1, \ldots, m$. All of the quantities
$m, \lambda_1, \ldots, \lambda_m$ are assumed to be unknown. The system is to
be run for a time $t$, with all failures that occur being repaired and the
defects that caused the failures being noted. The problem is to estimate the
resulting failure rate given that all defects that caused failures in $(0, t)$
are eliminated. Specifically, letting


$$\psi_i(t) = \begin{cases} 1 & \text{if defect } i \text{ does not cause a failure by time } t \\ 0 & \text{otherwise} \end{cases}$$

then we want to estimate

$$\Lambda(t) = \sum_{i=1}^{m} \lambda_i \psi_i(t).$$
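
To make the target quantity concrete, the sketch below simulates the
remaining failure rate at time t for a system whose defect rates are known to
the simulator. The rates and the test length are hypothetical values chosen
for illustration; in the actual problem they are unknown, which is why an
estimator is needed. Under the exponential assumption stated above, the
expected remaining rate is the sum of lambda_i * exp(-lambda_i * t), and the
simulation average can be compared with it.

```python
import math
import random

def remaining_rate(rates, t, rng):
    """One realization of the remaining rate: sum of lambda_i over defects
    whose first failure time exceeds t (those defects are still present)."""
    total = 0.0
    for lam in rates:
        first_failure = rng.expovariate(lam)  # exponential with rate lam
        if first_failure > t:
            total += lam
    return total

def expected_remaining_rate(rates, t):
    """Expected remaining rate: sum of lambda_i * exp(-lambda_i * t)."""
    return sum(lam * math.exp(-lam * t) for lam in rates)

# Hypothetical defect rates and test length, for illustration only.
rates = [0.5, 0.2, 0.1, 0.05, 0.01]
t = 5.0
rng = random.Random(0)
runs = 20000
average = sum(remaining_rate(rates, t, rng) for _ in range(runs)) / runs
print(average, expected_remaining_rate(rates, t))
```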


In  [3]  and  [4]  Ross  presented  and  analyzed  the  estimator 


D(t)  *t/Fi 
y  - - - 

/  i  . 


ill  V1—^) 


In  addition  a  stopping  rule  to  enable  one  to  decide  when  to  stop  the  testing 
phase  and  conclude  that  the  remaining  error  rate  is  below  some  preassigned 
value  was  developed  in  [4]. 

3. Reliability Models

In [5] Ross considered an $n$-component system such that each component
is initially on and stays on for a random time at which it fails. The problem
of interest is to characterize the distribution of the time until the system
fails. Whereas this problem is usually considered under the assumption that
the component lives are independent, the model in [5] supposes a Markovian
model in which the failure rate of a given component at any time is allowed to
depend on the set of working components at that time. Specifically, it
supposes that if at some time $W$, $W \subset \{1, 2, \ldots, n\}$, represents
the set of working components then for $i \in W$ the instantaneous failure rate
for component $i$ is $\lambda_i(W)$.

Specific  conditions  that  imply  that  the  system  life  is  IFR  and  IFRA  are 
presented.  A  method  for  easily  simulating  the  process  is  also  presented. 
Finally  the  model  is  generalized  to  allow  for  the  repair  of  failed  components 
and  conditions  implying  that  the  process  is,  in  steady  state,  time  reversible 
are  presented. 

In [6] Ross and Schechner considered some reliability applications of the
variability ordering, where if $X_1$ and $X_2$ are random variables having
respective distributions $F_1$ and $F_2$, then we say that $X_1 \le_v X_2$
(read $X_1$ is less variable than $X_2$) if

$$\int_{-\infty}^{\infty} f(x)\,dF_1(x) \le \int_{-\infty}^{\infty} f(x)\,dF_2(x)$$

for all increasing convex functions $f$.

Applications  to  a  variety  of  shock  and  survival  models  were  presented. 


In [7] Derman, Lieberman and Ross considered the problem of using
replacement to continually extend the life of a system. It was supposed that
there was a single vital component which would cause a catastrophe if it
failed while in use. By successively determining the times to replace this
vital component by one of a finite number of remaining spares, the optimal
policy was characterized.

4.  Peaks  in  Random  Data 

In an influential and controversial paper, Raup and Sepkoski
("Periodicity of Extinction in the Geologic Past," Proceedings of the National
Academy of Sciences of the U.S.A., 81, pp. 801-805, 1984) analyzed data
relating to the proportion of families that became extinct in each of 39 time
periods of (average) length 6.2 million years. They defined an event of mass
extinction to have occurred in any time period whose data value exceeded that
of its immediate predecessor and follower. Stating that the data indicated a
periodicity of mass extinctions, they then presented a statistical analysis
which they claimed verified this, and invalidated the previously held
belief that such data behaved as a random walk whose incremental change
distribution is symmetric about 0.

The statistical analysis of Raup-Sepkoski compared the observed value of
a proposed statistic with all 39! possible other values when the data points
are permuted. However, as noted by Ross in [8], such a permutation test is
only meaningful if the set of alternative hypotheses is such that,
conditional on the set of data values, all 39! possible orderings are equally
likely. That is, such a test is meaningful if one is testing periodicity
against the alternative hypothesis that the data values constitute a random
sample from some arbitrary probability distribution. It is not a meaningful
test if the alternative is that the incremental changes of the data constitute
a random walk. In addition it was then shown in Ross [8], by a nonparametric
analysis which employed simulation to test for goodness-of-fit, that the random
walk model is perfectly consistent with the observed data.

Let $X_1, X_2, \ldots$ be a sequence of random variables and say that a peak
occurs at time $n$ if $X_{n-1} < X_n > X_{n+1}$. When the random sequence
constitutes a random walk whose incremental change distribution is symmetric
about 0 then, as noted in [8], the process of peaks constitutes a renewal
process. However, when the $X_n$ constitute a random sample from a continuous
distribution then this is no longer true. Indeed, in this situation the times
between successive peaks are neither independent nor identically distributed.

The process of peaks, when the data constitute a random sample from a
continuous distribution, is analyzed by Ross in [9]. It is shown that $N(n)$,
the number of peaks by time $n$, is asymptotically normal with mean $(n-1)/3$
and variance $(2n+4)/45$. In addition, it is shown that, with probability 1,
$\lim N(n)/n = 1/3$. Finally, it is argued that the proportion of interpeak
times that are equal to $j$ converges, with probability 1, to a constant value,
call it $p_j$. The values of the $p_j$ are then given in terms of computable
integrals; and in particular it is shown that

$$p_2 = 2/5, \quad p_3 = 1/3, \quad p_4 = 6/35, \quad p_5 = 1/15, \quad p_6 = .02116401$$
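
These limits are easy to check empirically. The sketch below is an
illustration only and is not the analysis of [9]: it draws a random sample
from the uniform distribution (any continuous distribution would do), marks
the peaks, and tabulates the peak rate and the proportions of interpeak times.
It assumes the sample is indexed X_0, ..., X_n, so that peaks can occur at
times 1 through n-1; the function names are hypothetical.

```python
import random
from collections import Counter

def peak_times(xs):
    """Indices n with xs[n-1] < xs[n] > xs[n+1]."""
    return [n for n in range(1, len(xs) - 1) if xs[n - 1] < xs[n] > xs[n + 1]]

def peak_summary(n=200_000, seed=0):
    """Return (peaks per unit time, {j: proportion of interpeak times equal to j})."""
    rng = random.Random(seed)
    xs = [rng.random() for _ in range(n + 1)]  # X_0, ..., X_n
    peaks = peak_times(xs)
    gaps = Counter(b - a for a, b in zip(peaks, peaks[1:]))
    total = sum(gaps.values())
    proportions = {j: gaps[j] / total for j in sorted(gaps)}
    return len(peaks) / n, proportions

rate, proportions = peak_summary()
print(rate)
for j in range(2, 7):
    print(j, proportions.get(j, 0.0))
```

With a sample this long the peak rate should be close to 1/3 and the
proportions for j = 2, ..., 6 close to the values of p_j listed above.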